A novel indicator in epidemic monitoring through a case study of Ebola in West Africa (2014–2016)

The E/S (exposed/susceptible) ratio is analyzed in the SEIR model. The ratio plays a key role in understanding epidemic dynamics during the 2014–2016 Ebola outbreak in Sierra Leone and Guinea. The maximum value of the ratio occurs immediately before or after the time-dependent reproduction number (Rt) equals 1, depending on the initial susceptible population (S(0)). It is demonstrated that transmission rate curves corresponding to various incubation periods intersect at a single point referred to as the Cross Point (CP). At this point, the E/S ratio reaches an extremum, signifying a critical shift in transmission dynamics and aligning with the time when Rt approaches 1. By plotting transmission rate curves, β(t), for any two arbitrary incubation periods and tracking their intersections, we can trace CP over time. CP serves as an indicator of epidemic status, especially when Rt is close to 1. It provides a practical means of monitoring epidemics without prior knowledge of the incubation period. Through a case study, we estimate the transmission rate and reproduction number, identifying CP and Rt = 1 while examining the E/S ratio across various values of S(0).

www.nature.com/scientificreports/

To simulate the process, we utilize data from the Ebola outbreak in Sierra Leone and Guinea. This outbreak began in Guinea in December 2013 and later spread to other West African countries, including Liberia and Sierra Leone, resulting in nearly 30,000 infections from 2014 to 2016 [19–22]. In this case study, we estimate the dates on which CP and Rt = 1 occur and calculate the time difference between them. The time-dependent reproduction number reaches one a few days after CP, so its value is still greater than one at the time of CP. Therefore, CP has the potential to serve as a precautionary indicator that the disease is nearing control.

Materials and methods
A population N is partitioned into compartments labeled S, E, I, and R, representing susceptible, exposed, infectious, and removed individuals, respectively. The model includes four parameters: the time-dependent transmission rate (β(t)), the rate of progression from exposure to infection (σ), the removal rate (γ), and the case fatality rate (f) [23]. The SEIR model analyzed in this paper rests on several assumptions. First, the size of the total population remains constant: the net input to the susceptible compartment by births equals the net mortality [24]. This simplification is often adopted to focus solely on the dynamics of disease transmission without considering demographic changes. Second, the population is assumed to be homogeneous, meaning that individuals within each compartment are considered identical in terms of susceptibility, exposure, infectiousness, and recovery [25]. Third, the population is assumed to be well mixed, with each individual having an equal chance of coming into contact with any other individual [25–27]. This assumption facilitates modeling the spread of the disease in a population where interactions are random and frequent. Fourth, once individuals recover from the infectious stage, they gain immunity and cannot be infected again, at least for some period [28].
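As a minimal sketch of the compartmental structure described above, the following Python code advances an SEIR system with an additional death compartment by explicit Euler steps. The equations used (incidence β(t)SI/N, removals split by f into recovered and dead), the function names, and all numerical values in the usage lines are assumptions made for illustration; they are not the fitted quantities of the case study.

```python
def seir_rhs(state, beta, sigma, gamma, f, n_total):
    """Right-hand side of an assumed SEIR system with a death compartment D."""
    s, e, i, r, d = state
    new_exposed = beta * s * i / n_total          # flow S -> E
    return (
        -new_exposed,
        new_exposed - sigma * e,                  # E gains exposure, loses progression
        sigma * e - gamma * i,                    # I gains progression, loses removal
        (1.0 - f) * gamma * i,                    # surviving removals
        f * gamma * i,                            # disease-induced deaths
    )


def simulate(state, beta_of_t, sigma, gamma, f, days, dt=0.1):
    """Explicit Euler integration; returns a list of (time, state) pairs."""
    n_total = sum(state)
    history = [(0.0, state)]
    for step in range(int(round(days / dt))):
        t = step * dt
        rates = seir_rhs(state, beta_of_t(t), sigma, gamma, f, n_total)
        state = tuple(x + dt * dx for x, dx in zip(state, rates))
        history.append((t + dt, state))
    return history


# Hypothetical run: constant beta, 11.4-day incubation, 10.46-day infectious time.
history = simulate(
    state=(99_990.0, 0.0, 10.0, 0.0, 0.0),
    beta_of_t=lambda t: 0.3,
    sigma=1.0 / 11.4,
    gamma=1.0 / 10.46,
    f=0.6586,
    days=300,
)
final_time, final_state = history[-1]
print(final_time, sum(final_state))   # the total population stays numerically constant
```

Because the five rates sum to zero, the total S + E + I + R + D is preserved at every step, which mirrors the constant-population assumption.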
The transmission dynamics are described as follows:

$$
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta(t)\,S\,I}{N}, \\
\frac{dE}{dt} &= \frac{\beta(t)\,S\,I}{N} - \sigma E, \\
\frac{dI}{dt} &= \sigma E - \gamma I, \\
\frac{dR}{dt} &= (1-f)\,\gamma I.
\end{aligned}
\tag{1}
$$

The point of d(E/S)/dt = 0 and incubation periods
For various incubation periods, the transmission rate curves pass through a single point, as depicted in Fig. 3a,c. This point of intersection is denoted as CP. It is proven that the point where d(E/S)/dt = 0 coincides with the cross point. CP is independent of the incubation period (1/σ), and it can be easily estimated by plotting two transmission rate curves for any two incubation periods.

Theorem 1
The transmission rate curves (β(t)) share a single common point where d(E/S)/dt = 0. At this point, the value of β(t) is independent of σ [29].
By the quotient rule,

$$\frac{d}{dt}\left(\frac{E}{S}\right) = \frac{S E' - E S'}{S^{2}},$$

so d(E/S)/dt = 0 is equivalent to SE′ − ES′ = 0. From Eq. (1), β(t) is independent of σ when SE′ − ES′ = 0, that is, when d(E/S)/dt = 0.

To compare the times of d(E/S)/dt = 0 and Rt = 1, the value of Rt is investigated when d(E/S)/dt = 0. The time-dependent reproduction number can be expressed as [23]

$$R_t = \frac{\beta(t)\,S}{\gamma N}. \tag{2}$$

By rearranging the second equation in Eq. (1), the transmission rate can be written as

$$\beta(t) = \frac{N\,(E' + \sigma E)}{S\,I}. \tag{3}$$

Then, Rt can be written as

$$R_t = \frac{E' + \sigma E}{\gamma I}. \tag{4}$$

Since SE′ − ES′ = 0 at the point where d(E/S)/dt = 0, and S′ = −(E′ + σE) by Eq. (1), the time-dependent reproduction number at CP can be written as

$$R_t\big|_{\mathrm{CP}} = \frac{\sigma E}{\gamma I}\cdot\frac{1}{1 + E/S}. \tag{5}$$

The value of Rt at CP can be used to estimate which event occurs first: CP or Rt = 1. There are three cases of Rt values at CP: (i) Rt > 1, in which case CP appears before Rt = 1; (ii) Rt = 1, in which case the two coincide; and (iii) Rt < 1, in which case CP appears after Rt = 1.

In the SEIR model, σE represents the number of newly infected people entering compartment I, and γI represents the number of infected people leaving compartment I. The ratio σE/(γI) depends entirely on the variation in compartment I: it is greater than one whenever dI/dt > 0, as indicated by Eq. (1). The ratio E/S also plays a crucial role in determining the value of Rt at d(E/S)/dt = 0. At this point, E/S reaches a maximum while 1/(1 + E/S) attains a minimum. When S(0) is sufficiently large that S ≫ E, CP always appears earlier than Rt = 1. Therefore, CP can be a precautionary indicator that Rt = 1 is imminent. If the initial susceptible population S(0) is very small, the effect of E/S cannot be ignored in Eq. (5), and CP can appear after Rt = 1. Although the relative timing of CP and Rt = 1 depends on the value of E/S, the two are very close in usual cases, because both fall between dE/dt = 0 and dI/dt = 0, where σE/(γI) ≈ 1 and 1/(1 + E/S) ≈ 1. Therefore, CP can be an alternative indicator to Rt = 1, suggesting that the epidemic is almost under control when d(E/S)/dt = 0.
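The independence claimed in Theorem 1 can be made explicit under one extra assumption that is not stated in this section: that cumulative cases obey dC/dt = σE, so that E = C′/σ is read off the fitted curve C(t). The constant A and the symbol τ = 1/σ below are notation introduced only for this sketch, A is assumed to be the same for every incubation period, and the incidence term β(t)SI/N is likewise assumed.

```latex
% Assumption: dC/dt = \sigma E, hence E = \tau C' with \tau = 1/\sigma.
% Adding the S and E equations gives S' + E' = -\sigma E = -C',
% so S + E + C = A is constant in time.
\begin{align*}
S &= A - C - \tau C', \qquad E = \tau C', \\
S E' - E S' &= (A - C - \tau C')\,\tau C'' + \tau C'\,(C' + \tau C'')
             = \tau\left[\, C''(A - C) + C'^{2} \,\right], \\
\beta(t) &= \frac{N\,(E' + \sigma E)}{S\,I}
          = \frac{N\,(\tau C'' + C')}{I\,(A - C - \tau C')} .
\end{align*}
% At the time where C''(A - C) + C'^2 = 0, i.e. where d(E/S)/dt = 0:
\begin{align*}
\tau C'' + C' = \frac{C'\,(A - C - \tau C')}{A - C}
\quad\Longrightarrow\quad
\beta(t) = \frac{N\,C'}{I\,(A - C)} ,
\end{align*}
% which contains no \tau, provided I(t) is estimated independently of \sigma.
```

Because the bracket C″(A − C) + C′² does not involve τ, the zero of d(E/S)/dt falls at the same time for every incubation period, and all transmission rate curves take the same value there.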

Case study
In this study, we used cumulative case and death data from the Ebola outbreak in Guinea and Sierra Leone, sourced from the World Health Organization.

Cumulative case and death data
The time-dependent transmission rate and reproduction number in the SEIR model are estimated using only two regression functions, one for the cumulative case data and one for the cumulative death data. Two equations, for dC/dt and dD/dt, are added to the SEIR system in Eq. (6). Here, C represents cumulative cases and D represents disease-induced deaths. The inclusion of C and D in the SEIR system does not alter its dynamics; it simply accounts for the exposed and deceased population. Data fitting is conducted using the logistic equation as the base function [21].

Procedure to construct the transmission rate
We obtain regression functions by curve fitting the cumulative data of cases and deaths. Then, the values of γ and f are calculated using the linear least squares method in Eqs. (8) and (9). All variables I, R, E and S are determined, and the time-dependent transmission rate (β(t)) is constructed for various incubation periods, after which the time-dependent reproduction number (Rt) is obtained. The overall procedure for obtaining β(t) is summarized in Fig. 1.
C and D represent the curve-fitted data of the cumulative cases and deaths. By solving the inverse problem consisting of Eqs. (1) and (6), we estimate γ, f, I, R, E and S, and then β(t).

Curve fitting of C and D
To estimate the values of the parameters of the transmission dynamics in Eq. (1), we fit the model to the cumulative data of cases (C) and deaths (D). Coefficients a, b, c and d are obtained by fitting the solutions of Eq. (7). We used a logistic function for regression with the Levenberg-Marquardt method [31,32] in MATLAB [33], as it is convenient, well suited to describing the proposed method, and approximates the data very well. For Guinea, the R-squared values are 0.9999 for cases and 0.9999 for deaths (Fig. 2a). For Sierra Leone, the R-squared values are 0.9478 for cases and 0.9998 for deaths (Fig. 2b).
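The quality-of-fit figures above can be computed for any fitted curve with a few lines of Python. The sketch below assumes a three-parameter logistic curve k / (1 + exp(−r(t − t0))); this parameterization and the names k, r and t0 are assumptions for illustration and need not match the coefficients a, b, c and d of Eq. (7). It evaluates the curve, its analytic growth rate, and the R-squared statistic, but it does not perform the Levenberg-Marquardt fit itself, and the data are synthetic.

```python
import math


def logistic(t, k, r, t0):
    """Cumulative count at time t for an assumed three-parameter logistic curve."""
    return k / (1.0 + math.exp(-r * (t - t0)))


def logistic_rate(t, k, r, t0):
    """Analytic derivative of the logistic curve: r * C * (1 - C / k)."""
    c = logistic(t, k, r, t0)
    return r * c * (1.0 - c / k)


def r_squared(observed, fitted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: a logistic curve with a small alternating perturbation.
days = list(range(0, 400, 5))
fitted = [logistic(t, 3800.0, 0.05, 230.0) for t in days]
observed = [c + (20.0 if n % 2 else -20.0) for n, c in enumerate(fitted)]
print(round(r_squared(observed, fitted), 4))
```

The analytic derivative is the quantity the inverse method needs, since the compartments are recovered from the slopes of the fitted cumulative curves rather than from the raw counts.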

Estimation of removal rate (γ ) and fatality (f )
We assume that the total population size N = S + E + I + R + D in each country is constant and that the initial value of R(0) = 0.

It is rewritten as
The parameters f and γ are estimated using the linear least squares method or the pseudoinverse.
For the simulation, we choose data sets running from the beginning of the outbreak to various days during it [34]. The mean infectious times for Guinea are listed in Table 1. The mean infectious time is 10.46 days for the dataset of indices 1-200 and 9.90 days for the dataset of indices 1-240, which end on October 16, 2015 and December 18, 2015, respectively. The simulation uses 10.46 days as the mean infectious time. For Sierra Leone, we take the mean infectious time as 10.14 days, estimated from the data up to August 7, 2015. The fatality rate is 65.86% in Guinea and 30.50% in Sierra Leone, and neither changes significantly from the beginning to the end of the outbreak.
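One way such a linear estimate can arise is sketched below. It is an assumed formulation, not necessarily Eqs. (8) and (9): if dC/dt = σE, dD/dt = fγI, dR/dt = (1 − f)γI and R(0) = D(0) = 0, then I = C − D/f and dD/dt = (fγ)C − γD, which is linear in the unknowns fγ and γ. The function name, the synthetic logistic case curve and the forward-difference derivative are all illustrative choices.

```python
import math


def estimate_gamma_f(cases, deaths, dt):
    """Least-squares fit of dD/dt = p*C - q*D; returns (gamma, f) = (q, p/q)."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for k in range(len(deaths) - 1):
        y = (deaths[k + 1] - deaths[k]) / dt      # forward-difference dD/dt
        x1, x2 = cases[k], -deaths[k]             # regressors for p and q
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = a11 * a22 - a12 * a12                   # 2x2 normal equations
    p = (b1 * a22 - b2 * a12) / det               # p = f * gamma
    q = (a11 * b2 - a12 * b1) / det               # q = gamma
    return q, p / q


# Synthetic data generated with the values quoted in the text (assumed model).
TRUE_GAMMA, TRUE_F, DT = 1.0 / 10.46, 0.6586, 1.0
cases = [3800.0 / (1.0 + math.exp(-0.05 * (t - 230.0))) for t in range(400)]
deaths = [0.0]
for k in range(len(cases) - 1):
    growth = TRUE_F * TRUE_GAMMA * cases[k] - TRUE_GAMMA * deaths[-1]
    deaths.append(deaths[-1] + DT * growth)

gamma_hat, f_hat = estimate_gamma_f(cases, deaths, DT)
print(1.0 / gamma_hat, f_hat)   # mean infectious time and fatality rate
```

On this synthetic series the fit returns the generating values up to floating-point rounding, because the data were produced by the same discrete relation. With reported data the estimate would also depend on how dD/dt is approximated.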

Construction of β(t)
The values of S, E, I and R are determined according to Eqs. (10)–(15). The parameters f and γ are estimated using the linear least squares method, as detailed in Table 1.

Defining we have
Integrating dS dt in Eq. ( 1), we obtain The initial value Putting S into φ(t) , we have

Transmission rate curves and the date of CP
The incubation period is defined as the interval between exposure to a pathogen and the initial occurrence of symptoms and signs [35]. The mean incubation period of Ebola virus disease ranges from 2 to 21 days depending on simulation methods, data and country [20,23,35]. The time-dependent transmission rates are calculated for three different incubation periods. The first is 5.3 days, as estimated for the Ebola virus in Congo [23]. The second is 11.4 days, based on data from the WHO Ebola Response Team [20], and the third is the maximum incubation period [35]. Figure 3a,c shows the estimated transmission rates for these incubation periods (1/σ). The longer the incubation period, the greater the maximum value of the transmission rate and the faster its subsequent decay. The transmission rate curves for the three incubation periods also intersect at a single point, as stated in Theorem 1.
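The crossing of the transmission rate curves can be illustrated numerically. The sketch below is not the authors' MATLAB procedure: it assumes a logistic curve for cumulative cases, the relation dC/dt = σE (so E = C′/σ), a conserved total S + E + C, and an infectious curve obtained from dI/dt = C′ − γI. All numerical values except the incubation periods 5.3 and 11.4 days and the 10.46-day infectious time are hypothetical, and 21 days is used for the third curve as an assumed maximum.

```python
import math

# Hypothetical values chosen only to make the sketch run.
N_TOTAL = 50_000.0                 # population size N
A_CONST = 50_000.0                 # assumed constant value of S + E + C
K, R, T0 = 3_800.0, 0.04, 230.0    # logistic curve for cumulative cases C(t)
GAMMA = 1.0 / 10.46                # removal rate
DT = 0.05                          # grid spacing in days


def cases(t):
    return K / (1.0 + math.exp(-R * (t - T0)))


def cases_rate(t):                 # C'(t)
    c = cases(t)
    return R * c * (1.0 - c / K)


def cases_accel(t):                # C''(t)
    return R * cases_rate(t) * (1.0 - 2.0 * cases(t) / K)


def infectious_on_grid(n_steps):
    """Euler solution of dI/dt = C' - gamma * I, which does not involve sigma."""
    values = [cases(0.0)]
    for k in range(n_steps):
        values.append(values[-1] + DT * (cases_rate(k * DT) - GAMMA * values[-1]))
    return values


def beta_curve(infectious, incubation_days):
    """beta(t) = N (E' + sigma E) / (S I) with E = C'/sigma and S = A - C - E."""
    sigma = 1.0 / incubation_days
    curve = []
    for k, i in enumerate(infectious):
        t = k * DT
        e = cases_rate(t) / sigma
        e_rate = cases_accel(t) / sigma
        s = A_CONST - cases(t) - e
        curve.append(N_TOTAL * (e_rate + sigma * e) / (s * i))
    return curve


def cross_point(curve_a, curve_b):
    """First sign change of curve_a - curve_b, located by linear interpolation."""
    for k in range(1, len(curve_a)):
        d0 = curve_a[k - 1] - curve_b[k - 1]
        d1 = curve_a[k] - curve_b[k]
        if d0 * d1 < 0.0:
            w = d0 / (d0 - d1)
            return (k - 1 + w) * DT, curve_a[k - 1] + w * (curve_a[k] - curve_a[k - 1])
    return None


infectious = infectious_on_grid(int(round(400 / DT)))
beta_short = beta_curve(infectious, 5.3)
beta_mid = beta_curve(infectious, 11.4)
beta_long = beta_curve(infectious, 21.0)
print(cross_point(beta_short, beta_mid))    # (day, beta) for the pair 5.3 / 11.4
print(cross_point(beta_short, beta_long))   # (day, beta) for the pair 5.3 / 21
```

Under these assumptions both pairs of curves cross at the same day and the same β value up to interpolation error, so any two incubation periods suffice to locate the crossing.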

Time comparison of CP and R t = 1 points
Table 2 shows the values of E/S, σE/(γI), and Rt in Eq. (5) at CP. In both countries, Rt > 1 at CP, indicating that CP is expected to appear earlier than Rt = 1. For Guinea, Rt = 1 appears on the 241st day and CP on the 237th day from March 25, 2014. For Sierra Leone, Rt = 1 appears on the 189th day and CP on the 185th day from May 27, 2014 (see Table 4). The reference dates corresponding to Rt = 1 in Fig. 3 are approximately November 20, 2014 for Guinea and December 2, 2014 for Sierra Leone for the incubation period of 11.4 days (see Table 3). CP is very close to the date of Rt = 1 and is independent of the incubation period. Therefore, CP can be an alternative indicator suggesting that the disease is very close to being under control, Rt ≈ 1, even when the incubation period is unknown or estimated with high uncertainty. Table 3 presents the time sequence in which four events occur, including CP, Rt = 1, and the maximum values of E(t) and I(t). Using Eqs. (1) and (2), Rt = 1 can be expressed as

$$\frac{dE}{dt} + \frac{dI}{dt} = 0. \tag{16}$$

Rt = 1 occurs only a few days after CP, as shown in Fig. 3b,d.

Figure 3. Transmission rate (β(t)) for various lengths of incubation period and reproduction number Rt. (a, c) The transmission rates for various incubation periods (1/σ) are shown for Guinea and Sierra Leone. All transmission curves intersect at a single point (CP), which is independent of the incubation period. As the incubation period increases, the transmission rate exhibits a higher maximum value and decays more rapidly thereafter. CP (237, 0.1076) and CP (185, 0.1003) indicate that the transmission rate curves intersect at a common point on the 237th day with a transmission value of 0.1076 for Guinea and on the 185th day with a value of 0.1003 for Sierra Leone. (b, d) The plots depict β(t) and Rt for incubation periods of 11.4 days for Guinea and 10.14 days for Sierra Leone. In both cases, CP appears approximately 4 days earlier than Rt = 1.

Table 2. Information on E/S, σE/(γI), and Rt at CP for Guinea and Sierra Leone. The time-dependent reproduction number (Rt) at CP is 1.011 for Guinea and 1.028 for Sierra Leone. Additionally, the other values of factors are σE/(γI) = 1.0267, …

Information about the reported data and its corresponding dates
Table 4 shows the reported data points that correspond to the inflection points obtained from the regression curves. Between November 12 and 14, 2014, the cumulative number of confirmed cases in Guinea rose from 1878 to 1919, which corresponds to indices 63 and 64 in Table 4. The number of cases at the inflection point of the regression curve is between 1894 and 1901. This means that dE/dt = 0 occurs between November 12 and 14, 2014. In Sierra Leone, the inflection point of the regression curve lies between 6329 and 6667 cases. Hence, dE/dt = 0 occurs between November 21 and 27, 2014. The calendar date that corresponds to CP is November 15-16 for Guinea and November 26-27 for Sierra Leone. The days of dI/dt = 0 are November 24-25 and December 6-7, respectively. The time-dependent reproduction number is nearly 1 on November 20-22 for Guinea and December 2-4 for Sierra Leone. The time-dependent reproduction number (Rt) at CP was 1.011 for Guinea and 1.028 for Sierra Leone, as shown in Table 2. The index refers to the order in which the number of cases was reported, starting from March 25, 2014 for Guinea and May 27, 2014 for Sierra Leone.
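Day counts and calendar dates can be cross-checked with a short helper. The sketch below assumes that day counts are calendar days and that the start date itself is day 1; the text does not state this convention, so the flag `start_is_day_one` (a name introduced here) lets the other convention be tried as well.

```python
from datetime import date, timedelta

GUINEA_START = date(2014, 3, 25)
SIERRA_LEONE_START = date(2014, 5, 27)


def day_to_date(start, day_number, start_is_day_one=True):
    """Calendar date of a day count, under the chosen counting convention."""
    offset = day_number - 1 if start_is_day_one else day_number
    return start + timedelta(days=offset)


print(day_to_date(GUINEA_START, 237))        # 2014-11-16
print(day_to_date(SIERRA_LEONE_START, 185))  # 2014-11-27
```

Under this convention the CP days 237 and 185 fall on November 16 and November 27, 2014, which lie inside the date windows quoted for CP above.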

The ratio of E/S for various S(0)
Figure 4a,c illustrates the estimated E/S ratios for four values of S(0) ranging from 20,000 to 80,000 for Guinea and Sierra Leone. In Guinea, when S(0) is 20,000, the maximum value of E/S occurs around the 243rd day. When S(0) is 60,000, it occurs around the 239th day, and for 80,000, it occurs approximately on the 238th day, as depicted in Fig. 4a. For Sierra Leone, when S(0) is 20,000, the maximum value of E/S occurs around the 210th day. When S(0) is 80,000, the maximum value of E/S occurs on the 189th day, as shown in Fig. 4c. The extreme values of the E/S ratios traced from the E/S curves are shown together with CP in Fig. 4b,d. These extreme values are estimated pointwise from the E/S ratio curves, while the Cross Points are calculated from two transmission rate curves. Since dE/dt = 0 and dI/dt = 0 can be estimated through Eqs. (10)–(12), the time of Rt = 1 estimated from Eq. (16), dE/dt + dI/dt = 0, is not affected by S(0). When S(0) is 20,000, Rt = 1 appears before the maximum value of E/S. Until S(0) reaches 30,000 for Guinea and 60,000 for Sierra Leone, Rt = 1 appears before the maximum value of E/S. For S(0) over 100,000, the day of the maximum value of E/S converges to the 237th day for Guinea and the 185th day for Sierra Leone, as shown in Fig. 4b,d. As S(0) increases beyond 80,000, the maximum value of E/S appears before Rt = 1 for both countries in Fig. 4b,d.

Table 3. The dates of CP and Rt = 1. It indicates the dates when CP and Rt = 1 appear between dE/dt = 0 and dI/dt = 0. The calendar date that corresponds to the cross point is November 15-16 for Guinea and November 27-28 for Sierra Leone. CP is reached approximately one week before the time-dependent reproduction number decreases to one for 1/σ = 11.4 days.
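The dependence of the E/S maximum on S(0) described above can be reproduced qualitatively with a small grid search. The sketch assumes a logistic cumulative case curve with hypothetical parameters, the relation dC/dt = σE (so E = C′/σ), and conservation of S + E + C; none of these values come from the fitted data, so only the trend, not the day numbers, is meaningful.

```python
import math

K, R, T0 = 3_800.0, 0.04, 230.0     # hypothetical logistic curve for C(t)
SIGMA = 1.0 / 11.4                  # 11.4-day incubation period
DT = 0.05                           # grid spacing in days


def cases(t):
    return K / (1.0 + math.exp(-R * (t - T0)))


def exposed(t):
    """E = C'/sigma under the assumption dC/dt = sigma * E."""
    c = cases(t)
    return R * c * (1.0 - c / K) / SIGMA


def day_of_max_ratio(s0, days=400):
    """Grid search for the day on which E/S is largest for a given S(0)."""
    total = s0 + exposed(0.0) + cases(0.0)        # S + E + C is conserved
    best_day, best_ratio = 0.0, -1.0
    for k in range(int(round(days / DT)) + 1):
        t = k * DT
        e = exposed(t)
        s = total - e - cases(t)
        if e / s > best_ratio:
            best_day, best_ratio = t, e / s
    return best_day


for s0 in (20_000, 40_000, 60_000, 80_000):
    print(s0, round(day_of_max_ratio(s0), 2))
```

In this sketch the day of the maximum moves earlier as S(0) grows and approaches the inflection day of the case curve, which is the same qualitative behavior as the convergence reported for large S(0).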

Discussion
By solving the inverse problem, the time-dependent transmission rate is estimated from the cumulative data of the Ebola outbreaks in Sierra Leone and Guinea between 2014 and 2016 by rearranging the differential equation system of the SEIR model. The logistic equation (Eq. 7) fits the data very well, as shown in Fig. 2, and is very useful for explaining the proposed algorithm. However, other statistical methods, such as the adaptive Metropolis-Hastings (M-H) algorithm for the Bayesian Markov Chain Monte Carlo (MCMC) procedure, can also be used [36]. After obtaining the appropriate regression function, the variables (S, E, I and R) and parameters (f, γ) of the system can be found easily by the inverse method. Although the mean infectious time can be taken from other references [19,36], here it is estimated from the cumulative data using the pseudoinverse (Eqs. 8, 9). If the parameters (f, γ) are given, then Eqs. (1) and (2) are enough to estimate Rt, E/S, and CP. We can track the E/S ratio by tracking the distance between the transmission rate curves for two incubation periods. If this distance does not decrease, the infectious disease is continuing to spread, so quarantine measures must be further strengthened. At CP the difference becomes 0, which indicates that quarantine measures are being implemented appropriately.
The existence of CP is inherent in the SEIR model and, therefore, does not depend on data regression methods. After CP, the transmission rate for a longer incubation period is lower than that for a shorter incubation period. The point at which d(E/S)/dt = 0, or CP, is the moment when the transmission rate shifts to a less transmissible rate for a longer incubation period.
The length of the incubation period varies greatly depending on the initial infection dose, the rate of pathogen replication, and the defense mechanisms within the host [37,38]. The possibility that the characteristics of the pathogen changed before and after CP cannot be ruled out. At CP or near Rt = 1, the amount or pattern of replication of the pathogen in the host immune system may change. It is necessary to study changes in the incubation period depending on the characteristics of the pathogen within the host.
The accuracy of estimating the date of Rt = 1 depends entirely on the precision of the regression function, which in turn depends on the dataset used, such as data up to August or September 2014. In this case study, the date of Rt = 1 is estimated using the equation dE/dt + dI/dt = 0, derived from Eqs. (1) and (2). The inflection points of the cumulative data, at which dE/dt = 0 and dI/dt = 0, appear only a few days apart, as shown in Table 3. CP and Rt = 1 occur between dE/dt = 0 and dI/dt = 0, which means that the occurrence times of CP and Rt = 1 are very close. Although the timing of CP and Rt = 1 depends on the value of E/S in Eq. (5), they are very close to each other. The time-dependent reproduction number reaches one a few days after CP, indicating that the reproduction number is still greater than one at the time of CP. However, we can infer from CP that the epidemic will begin to decline within a few days. Thus, CP can be considered a new indicator that the epidemic is nearly under control. Moreover, since CP is not affected by the incubation period, it has the potential to serve as a criterion that can replace Rt = 1 when there is uncertainty about the length of the incubation period.
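The ordering of the three events (dE/dt = 0, then Rt = 1, then dI/dt = 0) can be checked on synthetic curves. The sketch below assumes a logistic cumulative case curve with hypothetical parameters, E = C′/σ (from the assumed relation dC/dt = σE), I(0) = 0, and an infectious curve integrated from dI/dt = C′ − γI. Rt = 1 is located through the condition dE/dt + dI/dt = 0 quoted above.

```python
import math

K, R, T0 = 3_800.0, 0.04, 230.0            # hypothetical logistic curve for C(t)
SIGMA, GAMMA = 1.0 / 11.4, 1.0 / 10.46     # progression and removal rates
DT = 0.01                                  # grid spacing in days


def cases(t):
    return K / (1.0 + math.exp(-R * (t - T0)))


def cases_rate(t):                         # C'(t)
    c = cases(t)
    return R * c * (1.0 - c / K)


def cases_accel(t):                        # C''(t)
    return R * cases_rate(t) * (1.0 - 2.0 * cases(t) / K)


def first_sign_change(values):
    """Grid time at which a sequence first drops from positive to non-positive."""
    for k in range(1, len(values)):
        if values[k - 1] > 0.0 >= values[k]:
            return k * DT
    return None


n_steps = int(round(400 / DT))
infectious = [0.0]                         # assumed I(0) = 0
for k in range(n_steps):
    infectious.append(infectious[-1] + DT * (cases_rate(k * DT) - GAMMA * infectious[-1]))

de_dt = [cases_accel(k * DT) / SIGMA for k in range(n_steps + 1)]
di_dt = [cases_rate(k * DT) - GAMMA * infectious[k] for k in range(n_steps + 1)]
total = [a + b for a, b in zip(de_dt, di_dt)]

day_e_peak = first_sign_change(de_dt)      # dE/dt = 0
day_rt_one = first_sign_change(total)      # dE/dt + dI/dt = 0, i.e. Rt = 1
day_i_peak = first_sign_change(di_dt)      # dI/dt = 0
print(day_e_peak, day_rt_one, day_i_peak)
```

At the peak of E the sum dE/dt + dI/dt is still positive, and at the peak of I it is already negative, so the zero that marks Rt = 1 necessarily falls between the two peaks.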
The value of S(0) is a crucial factor that determines the temporal relationship between Rt = 1 and CP. When S(0) is set to over 80,000, CP consistently appears earlier than Rt = 1 for both countries, serving as a precautionary indicator that Rt = 1 is imminent. However, if S(0) is small, such as 20,000, CP may appear after Rt = 1, as shown in Fig. 4. In this case study, S(0) is assumed to be close to the total population, which is in the millions for both countries. Consequently, CP is expected to appear earlier than Rt = 1, and this is also confirmed in the simulation.

Conclusion
In solving the inverse problem of the SEIR model, we prove that transmission rate curves for various incubation periods intersect at a single point, denoted as CP (Cross Point), where d(E/S)/dt = 0. The extreme value of the ratio E/S occurs immediately before or immediately after Rt = 1, depending on S(0). Therefore, at CP, Rt is very close to 1, and we can expect the epidemic to stabilize soon. The E/S value can be estimated from incidence data or cumulative data by the inverse method when the mean generation time and S(0) are given, and the extreme value of E/S can then be traced. However, tracing CP is more convenient. By plotting transmission rate curves, β(t), for any two arbitrary incubation periods and tracking where they intersect, we can trace CP in time. Since CP can be obtained using arbitrary incubation periods, accurate incubation period information is not required to find the extreme point of the E/S ratio. Tracking the E/S ratio through other approaches, such as stochastic methods and artificial intelligence, can be useful for predicting and estimating the state of the epidemic. If S(t) is controlled by an effective vaccine or appropriate interventions, CP can be reached very quickly. This would be one way to reach Rt = 1 quickly.

Figure 1. Procedure for estimating the time-dependent transmission rate.

Figure 2. Curve fitting to the cumulative case and death data of Guinea (a) and Sierra Leone (b). The Levenberg-Marquardt method with 95% confidence bounds is used. R-squared values above 0.947 are obtained in all cases (Guinea: cases 0.9999, deaths 0.9999; Sierra Leone: cases 0.9478, deaths 0.9998). The curve fitting uses data points up to October 18, 2015 for Guinea and up to August 7, 2015 for Sierra Leone.

Figure 4. Tracing the extreme points of the E/S ratio and CPs for various values of S(0) from 20,000 to 200,000. (a, c) Several E/S ratios are estimated for S(0) = 20,000 to 80,000 in increments of 20,000 for Guinea and Sierra Leone. The maximum value appears near the 243rd day for S(0) = 20,000 and the 238th day for S(0) = 80,000 for Guinea. For Sierra Leone, it appears near the 210th day for S(0) = 20,000 and the 189th day for S(0) = 80,000. (b, d) (above): These graphs illustrate the days on which the maximum values of E/S appear as the value of S(0) changes from 20,000 to 200,000. Days are counted from March 25, 2014 for Guinea and May 27, 2014 for Sierra Leone. The days of the maximum converge to the 237th day for Guinea and the 185th day for Sierra Leone. (b, d) (below): The Cross Points are calculated from the two transmission rate curves. Rt = 1 stays constant at the 241st day for Guinea and the 189th day for Sierra Leone for all values of S(0).

Table 1. Mean infectious times and fatality for various sets of data. The index represents the reported data number starting on March 25, 2014 for Guinea and May 27, 2014 for Sierra Leone. The mean infectious time (1/γ) for Guinea does not vary much after the 200th data point, on October 16, 2015. The simulation uses 10.46 days as the mean infectious time. For Sierra Leone, the mean infectious time decreases as the data set grows but starts to increase after index 170. In view of the mean infectious time in Guinea, the value at index 150 is chosen, where it is estimated as 10.14 days. The fatality rates in Guinea are between 65 and 66%, and those in Sierra Leone are between 30 and 31%.

Table 4. Reference dates for incidents. Data points are not uniform. Data points are numbered from the beginning of March 25, 2014 for Guinea and May 27, 2014 for Sierra Leone. The index indicates the order in which the number of patients was reported.